skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Search for: All records

Creators/Authors contains: "Woodworth Blake"

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

  1. Free, publicly-accessible full text available July 23, 2026
  2. Free, publicly-accessible full text available March 1, 2026
  3. null (Ed.)
    We present a direct (primal only) derivation of Mirror Descent as a “partial” discretization of gradient flow on a Riemannian manifold where the metric tensor is the Hessian of the Mirror Descent potential function. We contrast this discretization to Natural Gradient Descent, which is obtained by a “full” forward Euler discretization. This view helps shed light on the relationship between the methods and allows generalizing Mirror Descent to any Riemannian geometry in Rd, even when the metric tensor is not a Hessian, and thus there is no “dual.” 
    more » « less
  4. null (Ed.)
    Recent work has highlighted the role of initialization scale in determining the structure of the solutions that gradient methods converge to. In particular, it was shown that large initialization leads to the neural tangent kernel regime solution, whereas small initialization leads to so called “rich regimes”. However, the initialization structure is richer than the overall scale alone and involves relative magnitudes of different weights and layers in the network. Here we show that these relative scales, which we refer to as initialization shape, play an important role in determining the learned model. We develop a novel technique for deriving the inductive bias of gradientflow and use it to obtain closed-form implicit regularizers for multiple cases of interest. 
    more » « less
  5. null (Ed.)
  6. null; null; null; null (Ed.)